
feat: support structured outputs (response_format) in chat completions #43

Open

giwaov wants to merge 5 commits into OpenGradient:main from giwaov:feat/structured-outputs

Conversation

@giwaov
Contributor

@giwaov giwaov commented Mar 23, 2026

Summary

Implements OpenAI-compatible structured outputs support by wiring the response_format parameter through the chat completion pipeline, as requested in #14.

Changes

tee_gateway/controllers/chat_controller.py

  • Non-streaming path (_create_non_streaming_response): After tool binding, checks response_format. If the type is json_object or json_schema, binds it to the LangChain model via model.bind(response_format=...). The text type is a no-op (default behavior).
  • Streaming path (_create_streaming_response): Identical logic applied after tool binding.
  • TEE hash dict (_chat_request_to_dict): Includes response_format in the canonical serialized dict so the TEE signature covers the requested output format.
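The per-request binding described above can be sketched as follows. This is an illustration, not the actual controller code: `apply_response_format` and `_FakeModel` are hypothetical names; only the `model.bind(response_format=...)` call mirrors the pattern the PR describes.

```python
class _FakeModel:
    """Minimal stand-in for a LangChain chat model (illustration only)."""
    def __init__(self, bound=None):
        self.bound = bound or {}

    def bind(self, **kwargs):
        # LangChain's bind() returns a new runnable with kwargs attached.
        return _FakeModel({**self.bound, **kwargs})


def apply_response_format(model, response_format):
    """Bind an OpenAI-style response_format onto a chat model.

    'text' (or an absent response_format) is a no-op; 'json_object' and
    'json_schema' are forwarded via model.bind().
    """
    if not response_format or response_format.get("type") == "text":
        return model
    return model.bind(response_format=response_format)


model = _FakeModel()
bound = apply_response_format(model, {"type": "json_object"})
```

Because `bind()` returns a new runnable rather than mutating the cached model, the LRU-cached instance stays untouched between requests.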

tests/test_structured_outputs.py

14 unit tests covering:

  • Parsing response_format from request dicts (text, json_object, json_schema, and absent)
  • Inclusion in the TEE hash dict (presence, absence, determinism, differentiation)
  • Model binding behavior (json_object binds, json_schema binds full schema, text does not bind, absent does not bind)
  • Interaction with tool calling (both bind_tools and bind(response_format=...) chain correctly)
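The tool-calling interaction test in that list might take roughly this shape (a hypothetical sketch; the real tests live in tests/test_structured_outputs.py, and `FakeChatModel` is a stand-in, not part of the repo):

```python
class FakeChatModel:
    """Stand-in for a LangChain chat model (illustration only)."""
    def __init__(self, bound=None):
        self.bound = bound or {}

    def bind(self, **kwargs):
        return FakeChatModel({**self.bound, **kwargs})

    def bind_tools(self, tools):
        return FakeChatModel({**self.bound, "tools": tools})


def test_tool_and_format_binding_chain():
    # bind_tools() and bind(response_format=...) should compose: both
    # the tools and the format survive on the final bound model.
    fmt = {"type": "json_schema",
           "json_schema": {"name": "answer", "schema": {"type": "object"}}}
    bound = (FakeChatModel()
             .bind_tools([{"name": "lookup"}])
             .bind(response_format=fmt))
    assert bound.bound["tools"] == [{"name": "lookup"}]
    assert bound.bound["response_format"] is fmt


test_tool_and_format_binding_chain()
```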

Design Decisions

  • No changes to llm_backend.py: The response_format is bound per-request via model.bind() after retrieving the cached model, following the same pattern already used for tool binding. This keeps the LRU cache clean (keyed only on model/temperature/max_tokens).
  • Pass-through approach: The response_format dict is forwarded as-is to LangChain, which handles provider-specific translation. This maintains OpenAI API compatibility and works with all supported providers (OpenAI, Anthropic, Google, xAI).

Supported Formats

Per the OpenAPI spec already defined in the repo:

  • {type: text}: plain text (default, no-op)
  • {type: json_object}: JSON mode
  • {type: json_schema, json_schema: {name: ..., schema: {...}, strict: true}}: strict schema-constrained output
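For concreteness, illustrative request payloads for each format (the schema name `weather_report` and its fields are made up for this example):

```python
import json

# Illustrative response_format payloads for the three supported types.
text_fmt = {"type": "text"}
json_mode_fmt = {"type": "json_object"}
schema_fmt = {
    "type": "json_schema",
    "json_schema": {
        "name": "weather_report",  # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "temp_c": {"type": "number"},
            },
            "required": ["city", "temp_c"],
        },
    },
}

# Deterministic serialization is what lets the TEE hash cover the
# requested output format.
canonical = json.dumps(schema_fmt, sort_keys=True)
```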

Closes #14

Wire the OpenAI-compatible response_format parameter through the chat
completion pipeline:

- Bind response_format to LangChain model via model.bind() for
  json_object and json_schema types (text is a no-op)
- Apply to both streaming and non-streaming code paths
- Include response_format in the canonical request dict so TEE
  hashing covers the requested output format
- Add 14 unit tests covering parsing, hash-dict serialization,
  model binding, and interaction with tool calling

Closes OpenGradient#14
@adambalogh adambalogh requested a review from kylexqian March 23, 2026 21:36
Copilot AI left a comment
Pull request overview

This PR adds OpenAI-compatible structured outputs support to the chat completions pipeline by threading the response_format request parameter through to LangChain model invocation and including it in the TEE request hash.

Changes:

  • Bind response_format onto the LangChain chat model (non-streaming + streaming) via model.bind(response_format=...) for non-text formats.
  • Include response_format in the canonical _chat_request_to_dict(...) used for deterministic TEE hashing/signing.
  • Add a new unit test module covering parsing, hashing inclusion/determinism, and non-streaming model binding behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

File Description
tee_gateway/controllers/chat_controller.py Wires response_format into both non-streaming and streaming invocation flows and into the canonical request hash dict.
tests/test_structured_outputs.py Adds unit tests for request parsing, hash dict inclusion, and non-streaming bind behavior (including tool binding interaction).


@kylexqian
Collaborator

Hello! @giwaov

I tested this branch and ran into two issues:

  • json_schema requests were getting rejected with a 400 — needed to drop type from required in the OpenAPI spec for ResponseFormatJsonSchema_json_schema
  • Anthropic models error when using bind(response_format=...) — needs to go through with_structured_output(schema, method="json_schema") instead, with the schema name injected as title

Turns out Anthropic doesn't support json_object through LangChain anyway...

I also added streaming tests and fixed gpt-4o → gpt-4.1 in the existing tests.

Fix is here: 9b5278b

You can pull it in with:

git fetch origin feat/structured-outputs
git cherry-pick 9b5278b

Please pull it and push onto this PR. Alternatively, I'll just open a PR from my commit and merge that way.

Thanks again for your contribution!

…ng tests

- Route Anthropic json_schema requests through with_structured_output()
  instead of bind(), which Anthropic does not support for response_format.
  Raise a clear error for json_object (no Anthropic native equivalent).
- Inject the json_schema wrapper 'name' as 'title' in the schema dict so
  LangChain-Anthropic can derive a function name for its tool-use mechanism.
- Handle Anthropic structured output in the streaming path by invoking
  synchronously and emitting the result as a single SSE content chunk.
- Fix OpenAPI spec: remove 'type' from required in ResponseFormatJsonSchema
  so json_schema requests pass connexion validation.
- Fix pre-existing test breakage: gpt-4o -> gpt-4.1 (model removed from registry).
- Add streaming tests: binding behaviour for all providers, Anthropic SSE
  chunk output, and TEE hash content correctness.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@giwaov
Contributor Author

giwaov commented Apr 2, 2026

Hey @kylexqian, thanks for testing and catching those issues! Cherry-picked your fix (9b5278b) — all 24 tests passing locally, ruff format and lint clean. Pushed.

@kylexqian
Collaborator

Hello again @giwaov,

Thanks for the quick response! It looks like a lint error slipped through, and I also addressed one of the Copilot comments. Do you mind cherry-picking these two commits and adding them to your PR? They should hopefully be the last ones!

git cherry-pick 292f432
git cherry-pick 58e83fa

kylexqian previously approved these changes Apr 2, 2026

@kylexqian kylexqian left a comment
Add the cherry-pick commits that fix a couple of bugs and add tests. Other than that, looks good!

kylexqian and others added 2 commits April 2, 2026 11:47
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add _normalize_response_format() helper that coerces response_format to
a plain dict regardless of whether it arrives as a dict, Pydantic model,
or other object. Apply it in both streaming/non-streaming binding paths
and in the TEE hash dict — preventing silent json_schema payload loss
when rf_dict was reconstructed as only {"type": ...}, and preventing a
potential json.dumps failure in _chat_request_to_dict on non-dict input.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@giwaov
Contributor Author

giwaov commented Apr 2, 2026

Cherry-picked both (292f432 and 58e83fa). 24 tests passing, ruff clean. Pushed!

@giwaov
Contributor Author

giwaov commented Apr 2, 2026

Glad this is coming together! This has been a fun one to work on. With this and the other merged PRs, I've been spending a lot of time in the codebase and would love to get the Alpha OG role on Discord if that's something you guys can help with. Happy to keep contributing.
